
feat: add cpu/cuda config for prompt guard #2194


Merged: 2 commits into meta-llama:main on May 28, 2025

Conversation

@mhdawson (Contributor) commented on May 16, 2025

What does this PR do?

Previously, prompt guard was hard-coded to require CUDA, which prevented it from being used on an instance without CUDA support.

This PR allows prompt guard to be configured to use either CPU or CUDA.

Closes #2133

Test Plan (edited after incorporating the review suggestion)

  1. Started a stack configured with prompt guard on a system without a GPU and validated that prompt guard could be used through the APIs.

  2. Validated on a system with a GPU (but without llama stack) that the Python code selecting between CPU and CUDA returned the right value when a CUDA device was available (a sketch of such a check follows this list).

  3. Ran the unit tests as per https://github.com/meta-llama/llama-stack/blob/main/tests/unit/README.md
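
For step 2, a check along these lines would confirm the inputs to the device selection (a hedged sketch; the exact validation script is not part of the PR):

```python
# Hypothetical step-2 check: on a GPU machine, confirm that torch
# actually reports an available CUDA device.
import torch

print("cuda available:", torch.cuda.is_available())
print("device count:  ", torch.cuda.device_count())
```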

First commit:

Previously prompt guard was hard coded to require cuda which prevented it from being used on an instance without cuda support.

This PR allows prompt guard to be configured to use either cpu or cuda.

Signed-off-by: Michael Dawson <[email protected]>
@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on May 16, 2025
```diff
@@ -75,7 +75,7 @@ def __init__(
         self.temperature = temperature
         self.threshold = threshold

-        self.device = "cuda"
+        self.device = self.config.guard_execution_type
```
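
This first-commit approach implies a provider config field along these lines (a hedged sketch; the class name, pydantic usage, and validation here are assumptions, not the repo's exact code):

```python
from pydantic import BaseModel, field_validator


class PromptGuardConfig(BaseModel):
    # Field name taken from the diff above; "cuda" keeps the old default.
    guard_execution_type: str = "cuda"

    @field_validator("guard_execution_type")
    @classmethod
    def _check_execution_type(cls, v: str) -> str:
        if v not in ("cpu", "cuda"):
            raise ValueError("guard_execution_type must be 'cpu' or 'cuda'")
        return v
```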
@ashwinb (Contributor) commented on the diff:

Can we just check if CUDA is available and use that, otherwise use CPU? No need for a specific configuration like this to be added.
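
A minimal sketch of that suggestion in Python (an illustration, not necessarily the exact merged change):

```python
import torch


def select_device() -> str:
    """Prefer CUDA when a device is available, otherwise use CPU."""
    return "cuda" if torch.cuda.is_available() else "cpu"


# In the shield's __init__, this replaces the config-driven assignment:
# self.device = select_device()
```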

@mhdawson (Contributor, Author) replied:

@ashwinb I'll take a look at that and update

@ashwinb (Contributor) left a review:

requesting changes for my inline comment

Second commit:
Signed-off-by: Michael Dawson <[email protected]>
@mhdawson requested a review from bbrowning as a code owner on May 26, 2025, 17:50
@mhdawson (Contributor, Author) commented:

@ashwinb updated based on your suggestion. Thanks for taking the time to review my PR.

@ashwinb (Contributor) left a comment:

cool

@ashwinb merged commit a654467 into meta-llama:main on May 28, 2025
4 checks passed
Labels: CLA Signed (managed by the Meta Open Source bot)

Successfully merging this pull request may close this issue: "Not possible to use CPU inference with prompt-guard - intentional?" (#2133)

3 participants